Evaluating multi-core and many-core architectures through accelerating the three-dimensional Lax-Wendroff correction stencil

Authors

  • Yang You
  • Haohuan Fu
  • Shuaiwen Song
  • Maryam Mehri Dehnavi
  • Lin Gan
  • Xiaomeng Huang
  • Guangwen Yang
Abstract

Wave propagation forward modeling is a widely used computational method in oil and gas exploration. The iterative stencil loops in such problems have broad applications in scientific computing. However, executing such loops can be highly time-consuming, which greatly limits their performance and power efficiency. In this paper, we accelerate the forward-modeling technique on the latest multi-core and many-core architectures such as Intel Sandy Bridge CPUs, NVIDIA Fermi C2070 GPUs, NVIDIA Kepler K20x GPUs, and the Intel Xeon Phi co-processor. For the GPU platforms, we propose two parallel strategies to explore the performance optimization opportunities for our stencil kernels. For Sandy Bridge CPUs and MIC, we also employ various optimization techniques in order to achieve the best performance. Although our stencil with 114 component variables poses several great challenges for performance optimization, and the low stencil ratio between computation and memory access is too inefficient to fully take advantage of our evaluated architectures, we manage to achieve performance efficiencies ranging from 4.730% to 20.02% of the theoretical peak. We also conduct cross-platform performance and power analysis (focusing on Kepler GPU and MIC) and the results could serve as insights for users selecting the most suitable accelerators for their targeted applications.
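As a rough illustration of what an iterative stencil loop looks like, and why its ratio of computation to memory access tends to be low, the following Python sketch sweeps a generic 7-point stencil over a 3D grid. It is not the 114-variable Lax-Wendroff correction stencil from the paper: the coefficients `c0` and `c1`, the function names, and the flop and byte counts are all assumptions chosen for illustration.

```python
def stencil_step(u, c0=0.4, c1=0.1):
    """One sweep of a generic 7-point stencil over the interior of a 3D grid.

    u is a nested list indexed as u[z][y][x]; boundary values are copied
    through unchanged. c0 and c1 are illustrative weights, not the paper's.
    """
    nz, ny, nx = len(u), len(u[0]), len(u[0][0])
    out = [[row[:] for row in plane] for plane in u]
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                out[z][y][x] = c0 * u[z][y][x] + c1 * (
                    u[z - 1][y][x] + u[z + 1][y][x]
                    + u[z][y - 1][x] + u[z][y + 1][x]
                    + u[z][y][x - 1] + u[z][y][x + 1]
                )
    return out


def run(u, steps):
    """Iterative stencil loop: apply the same sweep for a number of time steps."""
    for _ in range(steps):
        u = stencil_step(u)
    return u


def arithmetic_intensity(flops_per_point, bytes_per_point):
    """Floating-point operations performed per byte moved to or from memory."""
    return flops_per_point / bytes_per_point


# Assumed counts for this toy stencil: 8 flops per point (6 adds, 2 multiplies).
# Without any cache reuse, each point touches 7 reads + 1 write of 8-byte doubles.
no_reuse = arithmetic_intensity(8, 8 * 8)   # 0.125 flops per byte
# With ideal reuse, each point costs one read and one write.
ideal_reuse = arithmetic_intensity(8, 2 * 8)  # 0.5 flops per byte
```

Even in the ideal-reuse case this toy kernel performs well under one flop per byte, which is the kind of imbalance that keeps stencil codes far below a processor's theoretical peak.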

Similar resources

An alternative formulation of finite difference WENO schemes with Lax-Wendroff time discretization for conservation laws

We develop an alternative formulation of conservative finite difference weighted essentially non-oscillatory (WENO) schemes to solve conservation laws. In this formulation, the WENO interpolation of the solution and its derivatives are used to directly construct the numerical flux, instead of the usual practice of reconstructing the flux functions. Even though this formulation is more expensive...


Auto-tuning Stencil Codes for Cache-Based Multicore Platforms

Auto-tuning Stencil Codes for Cache-Based Multicore Platforms, by Kaushik Datta. Doctor of Philosophy in Computer Science, University of California, Berkeley. Professor Katherine A. Yelick, Chair. As clock frequencies have tapered off and the number of cores on a chip has taken off, the challenge of effectively utilizing these multicore systems has become increasingly important. However, the diversi...


Parallelization Framework for Scientific Application Kernels on Multi-Core/Many-Core Platforms, by Liu Peng. A dissertation presented to the Faculty of the USC Graduate School, University of Southern California

ion to allow reasoning about their behavior across a broad range of applications. Programs that are members of a particular class can be implemented differently and the underlying numerical methods may change over time, but the claim is that the underlying patterns have persisted through generations of changes and will remain important into the future. The seven dwarfs defined by Phil Colella...


Evaluating the Application of Reinforcement Correction Factor for Concrete Core Testing

This study investigates the reinforcement correction factor of concrete core in more detail to prepare appropriate outlines for interpretation of results. This investigation aims to minimize uncertainties involved to carry out the more realistic condition assessment of suspect buildings before taking up retrofitting/strengthening measures. For this purpose, an extensive experimental program inc...


A Robust Technique to Make a 2D Advection Solver Tolerant to Soft Faults

We present a general technique to solve Partial Differential Equations, called robust stencils, which make them tolerant to soft faults, i.e. bit flips arising in memory or CPU calculations. We show how it can be applied to a two-dimensional Lax-Wendroff solver. The resulting 2D robust stencils are derived using an orthogonal application of their 1D counterparts. Combinations of 3 to 5 base ste...


Journal:
  • IJHPCA

Volume 28

Publication date: 2014